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Abstract. Implicit-explicit (IMEX) Runge-Kutta methods play a major rule in the numerical 
treatment of differential systems governed by stiff and non-stiff terms. This paper discusses order 
conditions and symplecticity properties of a class of IMEX Runge-Kutta methods in the context 
of optimal control problems. The analysis of the schemes is based on the continuous optimality 
system. Using suitable transformations of the adjoint equation, order conditions up to order three 
are proven as well as the relation between adjoint schemes obtained through different transformations 
is investigated. Conditions for the IMEX Runge-Kutta methods to be symplectic are also derived. 
A numerical example illustrating the theoretical properties is presented. 
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1. Introduction. Recently, there has been intense research on the time dis- 
cretization of optimal control problems involving differential equations. Such meth- 
ods have found widespread applications in aerospace and mechanical engineering, the 
life sciences, and many other disciplines. In particular, properties of Runge-Kutta 
methods have been investigated for example in [ini 131 [HI [181 [El [3 [9] . Hager [T0] 
investigated order conditions (up to order four) for Runge-Kutta methods applied to 
optimality systems. This work has been later extended [31 [TB] and also properties 
of symplecticity of the scheme have been studied, see also [Bj. Further studies of 
discretizations of state and control constrained problems using Runge-Kutta meth- 
ods have been conducted in O [9] . The observations lead to the idea to extend also 
other schemes like W-methods to optimal control problems [18]. Further, automatic 
differentiation has been applied to Runge-Kutta discretizations 

In many practical application involving systems of differential equations of the 
form 

y'{t) = fiyit),t)+g{y{t),t), 

where / and g are eventually obtained as suitable finite-difference or finite-element 
approximations of spatial derivatives, the time scales induced by the two operators 
may be considerably different. Let us assume that / is the non-stiff term and g the 
stiff one. Although the problem is stiff as a whole, the use of fully implicit solvers 
originates a nonlinear system of equations involving also the non-stiff term / which 
quite often represent the most expensive/difficult term in the computation. Thus it is 
highly desirable to have a combination of implicit and explicit (IMEX) discretization 
terms to resolve stiff and non-stiff dynamics accordingly. For Runge-Kutta methods 
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such schemes have been studied in [U El |5l [71 HH |T7l EHl US] . Among the prominent 
examples are the numerical integration of hyperbolic conservation laws, convection- 
diffusion equations and singular perturbed problems. 

As discussed in [T71 [2D] the construction of such methods implies new difficulties 
due to the appearance of coupled order conditions and to the possible loss of accuracy 
close to stiff regimes. The present work is concerned with the use of implict-explicit 
methods in the context of optimal control problems. Here we focus our attention to 
the order condition of the adjoint IMEX system and its symplecticity property leaving 
to further research specific application to partial differential equations. We refer to 
I12j for examples of applications to hyperbolic problems. 

The general IMEX Runge-Kutta scheme is introduced in Section 2 as well as its 
discrete adjoint equations. A transformation of these equations is proposed in order 
to later on analyse order conditions and symplecticity properties. Since the presented 
transformation is different from the one used for example in (TUl [H] we also discuss 
the relation between the schemes obtained by using the different transformations. 
The existing relations are summarized in Figure [2T] We furthermore investigate the 
relation between the two possible approaches to derive the optimality system: we 
prove in Theorem 2.1 that discretize-then-optimize and optimize-then-discretize are 
equivalent. The order conditions up to order three are summarized in Theorem 3.1 
and the results on symplecticity are given in Theorem 3.2. A numerical example is 
presented in Section 4. Examples of IMEX Runge-Kutta schemes up to order three 
are reported in a separate appendix. 

2. IMEX Runge-Kutta methods for optimal control problems. 

2.1. The optimal control problem. We consider optimal control problems 



for ordinary differential equations of type (2.1 ): 



(OCP) min j(y(T)) such that 

yit) = fiyit),uit)) + g{y{t),uit)), 



te[0,T] 



(2.1a) 
(2.1b) 
(2.1c) 



Related to the optimal control problem we introduce the Hamiltonian function H 
as H{y,u,p) :— P"^(/(2/,it) + g{y,u)). Under appropriate conditions it is well-known 



[131 US] that the first -order optimality conditions for (2.1) are 



f{y,u) 



9{y,u), 



y= Hp{y,u,p) 

p = -Hy{y,u,p) = -Jy{y,uYp-gy{y,uyp, 
= Hu{y,u,p) ^ fu{y, ufp + gu{y, ufp. 



2/(0) = 2/° (2.2a) 
p(T) = j'{y{T)) (2.2b) 
(2.2c) 



The equation (2.2a) is called state equation and (2.2b) is called adjoint equation. 



We are interested in implicit-explicit Runge-Kutta (IMEX-RK) discretizations for 
(2.2a) and (2.2b I. To be more precise, we treat / by an explicit method and assume 



that g enjoys some stiffness so that an implicit method is required. Therefore, the 



discretization of the general case of (2.2a I leads to two different schemes for / and 
g, respectively. A corresponding Runge-Kutta discretization scheme [TH HD] with s 
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stages is given by 

s s 

Y^''> - Vn + /I E ^■ufiY.i''^ , <) + ^ E ,<) » = 1, .., s (2.3a) 

y„+i= y„ + /i^^,/(r«,<) + /i^c^,g(y«,<), n = 0,l,2,. (2.3b) 

i=l 1=1 

where A,A,uj,ui are the associated Runge-Kutta coefhcient matrices and the Runge- 
Kutta weights, respectively. We refer to methods where an exphcit scheme for / and 
an imphcit scheme for g is used as as IMEX-RK schemes. Their properties have been 
investigated for example in [TJ [Tni HO] ■ A particularly interesting subclass is the class 
of diagonally implicit IMEX-RK methods. 



Definition 2.1. We call the method (2.3) diagonally implicit IMEX-RK method, 
^ff '^ij — foT j ^ i cind a.ij — for all j > i. 

In order to simplify the notation in the sequel we do not truncate the correspond- 
ing sums and, if not stated otherwise, all following results are given for a general 
implicit-explicit methods. Only later we will reduce the investigation to the class of 
diagonally implicit methods. 

As in |3l [20l [19] we use an equivalent formulation of the IMEX-RK scheme in 
order to derive the discrete first-order optimality conditions. Instead of representation 



(2.3) we use the equivalent formulation (2.4) 



S S 

K^:^ = 5 I yn + E ^^^^^'^ + E ^ ' < 1 (2.4b) 

S S 

Vn+i = Vn + hJ2Cj,Kl:'> + /iE^»^"^- (2.4c) 

1=1 i=l 

Note that due to the two different schemes for / and g, respectively, we introduce two 
auxiliary variables and K'^'^\ These lead to additional discrete adjoint equations 
compared with the formulation in [3]. The associated discretized optimal control 



problem to (2.1 ) using IMEX-RK is hence given by 



{DOP) minj(yAr) such that (2.5a) 

if(') = / I y„ + /» E + E "^^^^'^ ' < I (2-5b) 

if« = g |2/„ + ;i^a,jX(^)+/i^a„if(j"),< I (2.5c) 

S S 

yn+l=yn + hY,^^K^'^ + hY,^^K^'\ Vo ^ ■ (2.5d) 

i=l 1=1 
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Clearly, the Lagrangian is 



n=0 



\ 1=1 1=1 



(2.6) 



i=l 



i=l 



where F^*' := ?/„ + hJ^'j^iO-ijKn^ + hJ2^j=iO-ijKn' . Here, the vectors and p 
are the Lagrange multipliers corresponding to the equality constraints given by the 



initial condition and system ( 2.4 1 , respectively. For the first order necessary optimality 



conditions for ( 2.5 1 we obtain the feasibility conditions given by ( 2.4 1 and furthermore 



the discrete adjoint equations which are derived upon differentiation of C{y,p,^,£,) 
with respect to Kn\ K^^ and respectively. The system of adjoint equations reads 



3 = 1 



3 = 1 



Pn 



= Pn+l +E fv^^''^<f^+Y.9v^^^<f^^ PN=j'{yN)i2.7c) 



where the index range for n is N — 1, .., and the intermediate adjoint states have to 



be computed for i = 1, .., s. The discretization method (2.7) is not yet in a standard 



RK notation. Similar to O [101112] we have the following result. 

Proposition 2.1. If we assume that Ui ^ and uji ^ then (2.1) can be 
rewritten as 



P(') ^Pn-hY, 5., /, (ri^) , KfP^^^ -hY, oi.,3 9y {Y^''^ , <fP^='^ (2.8a) 

S S 

P« = p„ -hYP^3 fy{Y^'\uifP^^^ ^h^Pn 9y{Y^'\<fP^=^ (2.8b) 



Pn+l = Pn 



= Pn-hY fy {y1'^ , <fP^'^ ~hY^, gy{Y}^^ , <)^P(^) , (2.8c) 



i=l 



where the coefficients Uij , aij , Pij and /3ij are given by 



&ii ■— Z^O-ji, CXij ■— iOj z^Uji, /3ij '■— UJj ^dji, Pij '■— 



UJi 



Proof. If cji 7^ and uJi ^ Q, then we can define new variables 



and Pi*' 



(i = l,..,s; n = 0,..,7V-l). (2.9) 
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Optimal 
Control 
Problem 



Discrete 
Problem 
using (SRKl) 



V.T. 



Discrete 
Problem 
using (SRK2) 



Discrete 
Problem 
using (RK) 



opt. 



opt. 



optimization 



Optimality 
System: 
(SRK1)+(ARK1') 



Optimality 
System: 
(SRK2)+(ARK1) 



optimization 



Continuous 
Optimality 
System 




Optimality 
System: 
(SRK1)+(ARK2) 



Optimal ity 
System 
(Hager,Bonnans) 



Fig. 2.1. Diagram summarizing the relations discussed. Here 
transformation", (RK) denotes the single RK method for ^2. 2a}^ c 
system (SRK2) denotes system l[2.4^ , (ARKl) denotes system \2. 

g[ ) and (ARKl ') denotes the discrete adjoint system ^2.11\ . 



V. T. " stands for "variable 
El \M, (SRKl ) denotes 
, (ARK2) denotes system 



We obtain (|2.8|) using the definition of Pn^ and Pn^ in fi*' and £}n \ respectively. U 



Remark 2.1. Referring to the classification of IMEX-RK methods given in f^, 
Proposition\2. 1\ is extended to IMEX schemes of type AR S with uji — 0. In this case, 



define Pn^ for i = 2, .., s and Pn^ for i — 1 



'2. 9 ) and use the transformation 
to obtain (2.8a) and l2.8h ) for i ~ l,..,s and i — 2,..,s, respectively, and for the 
further equation we set 



p(i 



1=1 



i.e. we use again {2.8b) and define the coefficients fiij :— ujj and Pij := ujj. The 



remaining coefficients are defined as in Proposition 2.1 



2.2. Discrete and continuous optimality systems. We prove the foUowing 
results on the relations dcpicitcd in Figure 2.1 The discrete optimality system of (2.51 



represents a RK discretization of the continuous optimality system (2.2). The system 



obtained by discretizing the continuous optimality system ( 2.1 ) and by optimizing the 
discretized optimal control problem coincide. 

Theorem 2.1. We consider the first-order necessary optimality conditions (2.2) 



of the optimal control problem (2.1) 



The equations (2.3) and (2.8) are the discrete state and adjoint equations of the 



optimality system to the problem minj'(yjv) subject to (2.3) 



and adjoint [2.2b) equation 



The equations [2.3) and [2.8) are a discretization of the continuous state (2.2a) 
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The proof of Theorem 2.1 fohows showing that system (2.3) yields a discretiza- 
tion scheme (2.11) for (2.2b) and the latter is transformed into (2.8) by a variable 
transformation of the intermediate states. 



Lemma 2.1. Given the optimal control problem (2.1) and the RK discretization 
(2.3) for the state equation [2.2a). Then, the associated RK discretization of the 
discrete adjoint equation is equivalent to (2.8). 

Proof. The associated discretized optimal control problem is given by 

(DOP,) mm J iyr,) (2.10a) 

S S 

y« = y„ + + hY,a.,g(Y,i^\ui), (2.10b) 

s s 

Vn+1 = Vn + /i^c^./(ri'\<) + /i^c.,.g(y«,<), yo = y". (2.10c) 

i=l i=l 

The stationary points of the Lagrangian yield for i — 1, . . . , s and n = 1, . . . , iV — 1 

T 



(2.11a) 



- ^ ~a,, A(r„«, O^C'-''^ + h a,, g,(ri'), O^C^^') (2.11b) 



Pn = P 



^Pn+l+YCi'\ PN^fiVN) 







Pn+l 



(2.11c) 
(2. lid) 



+ Y.~a,,fuiY^'\<fC^^^+hY,aj.guiY^'\ul,fc(^^ (2.11e) 
j=i i=i 



The system (2.11) is not in a standard RK formulation. As in (TUl [H] we reformu- 
late this system as a standard partitioned RK method for p. In contrast to [TD] we 
introduce two new variables: 



ft-' 



and pi^):^p^^,+J2^cU), 



(2.12) 



for all i — I, .., s and n — 0, .., N — 1. Then, (2.11 ), yields the equivalent formulation 
of (2.8 1 with coefficients 



Rewriting the resulting backward scheme as a forward scheme then yields (2.8). □ 

Lemma 2.2. Assume the optimal control problem (2.1) and the RK discretization 
scheme (2.3) for the .state equation (2.2a) are given. The discrete state and adjoint 
equation (2.3) and [2.8), respectively, are equivalent to a RK discretization of the 
continuous optimality system (2.2). 



Proof. Note that adding a corresponding set of equalities for p, the system (2.31 
together with (2.8) corresponds to a standard additive RK scheme for the extended 
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(modified) system with p(T) = p{T) and 



y .f{y,u) +g{v,u) 

p = fy{y,uY'P + 9y {y, y-fp 
fy{y,u)'^p + 9y{y,uVp 



p 



(2.13a) 
(2.13b) 
(2.13c) 



The system (2.3) and (2.8| form an additive RK method for (2.13). Since p and p 



have the same initial conditions, we have p = p- Therefore the solution to (2.13b) and 



(2.13c) are equivalent to the solution to (2.2b). Finally, the approximate adjoints ph 



and Ph are the discretized solution to (2.2b) for the same initial equations. □ 

Theorem 2.1 follows directly by Lemma 2.1 and Lemma 2.2 We have a few 
comments on the implications of the previous theorem. 



In view of ( 2 



four RK methods. 



, we obtain four additional RK coefBcient matrices and therefore 
Furthemore, we have the two RK methods for the IMEX dis- 
cretization (2.3 ). If we additionally assume that uji = Qi holds for all « = 1, .., s, then 
we obtain an — 



and Pij — Pij for all i,j = 1, 



., s. 



Hence we obtain only two 
additional RK schemes for the optimality system (2.2). Moreover, the equation is in- 



dependent of the discretization of (2.2a). This implies that either {DOP) or (DOPi) 
can be used to discretize the problem. 



Also, if the Runge-Kutta methods for / and g coincide, i.e. dij = a^- and Wi 



for all i, j, then the discrete adjoint scheme (2.8) coincides with the one derived by 



Hager [TD] and it also coincides with Bonnans et al [S], respectively. We therefore 
recover their results, li g = 0, then an explicit RK method for / in (2.3) yields a 



"backward" explicit method for (2.7) and an implicit method for (2.8). Here, the vari- 
ables and Pn"^ vanish since there is no contribution to p„ and Pn+i, respectively. 
If / = 0, then the backward variant of (2.8) yields a "backward" diagonally implicit 
method for g. 



The system (2.7) is a "backward in time" scheme for the adjoint variable p, with 
initial value j'{yN)- Note that if (2.3) is an implicit-explicit (IMEX) RK scheme with 



a diagonally implicit method (DIRK) for then (2.7) also is an IMEX method with 



a diagonally implicit method for the terms that belong to g. Hence, the presentation 



simplifies in this case which also has been discussed in [T^]. Although (2.8) is more 
suitable for theoretical investigations, for an efhcient implementation of the scheme 



we used formulation (2.7). 

The discretization of equation ( |2.2c ) is straight -forward 



a.feA(yi'=), O^PW + u,g^{Y^!^^\uirPi!^^ = 0. (2.14) 
Next, we prove that for a suitable discretization _ff''(j/„,p„_|_i, it„) of t he H amiltonian 



H{y,p,u) ( 2.14 ) is equal to Vu^H {y 
(2.14) is a valid discretization of Hu{y,p,u) - 
Lemma 2.3. Let 



H''iy„,p„+i,Un) ■=Pn+i 



Un) = 0. Hence, Lemma 
= in the limit /i — >■ 0. 



2.3 



shows that 



with 



with fu 



(k) 



f{Y!C\K) andg^-) ■.= g{Y,^\u'^). Then, 

^u,H'^{yn,Pn+l,U„)=^k{fi'YPr['^+^k{9i'YPi^ 
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Proof. In order to simplify the notation, we introduce the following matrices 
B : s X s block matrix with block entries (i, j) : [aji{fy^Y' + ^ji{9y'^)'^\i hi = 1; 



C : s X 1 block matrix with block entries 
Dfc : 1 X s block matrix with block entries 
Dk : 1 X s block matrix with block entries 



[fijfcld], z = l,..,s 
[ajfeld], z = l,..,s 



1, ..,s 



with f^^ := fy{Y^"\u^) and g^y '' := gy{Yjl"\u^). Moreover, define M := {Id - hB) 
and note that M is invertible if h is sufficiently small. Using the just defined matrices, 
we rewrite (2.11a) as 



MCn = hCpn+i 



where Cn = {{Cn^)'^ , (Cn^)'^)'^ and the equations of (2.12) are 

^kPi^^ = + DkCn and LOk 



Pi''^ = LUkPn+l + DkCn- 



Furthermore, differentiating (2.3) with respect to Uk yields 



(2.15) 



(2.16) 



(2.17) 



{{Yr 



(l)^T 



,., (ri"^)'^)^. Now, we use \2.15\, (|2.16|) and \2.17\ and evaluate 



+1 



where Yn 

the gradient of if''(j/„,p„+i, u„) with respect to 

J2 (^.v„,y«(/«)V+i +^.v„,i;W(g«)^p„+i) + (/!'=)) V+i + c^.(5i'))^P„ 

= Vu,Y„Cpn+l + ^kifi'^'^fpn+l + UJkigi^^fpn+l 

= {hifi'YDk + Hgi'^fDk) M-^Cpn+i + Cj^ifi^Ypn+i + Mgi'YPn+i 

= ^.(/W)^pW+c.,(.9W)^pW. 

□ 

3. Properties of the IMEX Runge-Kutta discretizations. Theoretical 
properties of the derived RK method for system (2.2) corresponding to the parti- 
tioned RK method are considered. 

3.1. Order conditions. We analyse the order conditions for the RK method 



( 2.3 ) together with ( 2.8 ). As in the proof of Lemma 2.2 we add the additional equation 



Pn+l = Pn 



hJ2^^fyiYji'\<fP^'^-hJ2^.gyiY(^\KfP<^^\ (3.1) 



for p such that the resulting method corresponds to a standard additive RK scheme 
for the auxiliary problem (2.13). Since the system is completely coupled, we also 
obtain a similar coupling in the order conditions [17 . 

We start with the analysis of first and second order conditions in the general case, 
i.e. A ^ A and uj ^ uj. We then restrict ourselves to the case w = a;. In this case 
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all additional coupling conditions for first and second order are directly satisfied by 
the order condition for the forward IMEX scheme (2.3). For third order we obtain an 
additional condition. If we consider the adjoint equation alone those conditions have 
been studied in [T2] for a decoupled system (2.3), (2.8). 
Define the coefficients 



i=i i=i j=i i=i 



and 



1=1 



1=1 



1=1 



Proposition 3.1. Consider the additive Runge-Kutta method (2.3) together with 
{2.8) and (3.1) as a discretization scheme for { 2.l3j^ . For (2.3) consider a diagonally 
implicit IMEX-RK method. Then the following results hold true. 

1. The additive method is of first order, if the diagonally implicit IMEX-RK 
method (2.3) is of first order. 

2. The additive method is of second order, if the diagonally implicit IMEX- 
RK method (2.3) is of second order and it additionally satisfies the following 
coupling conditions 



E 



E 



— a., 



E 



E 



i=l ' 1=1 

Proof. The first part of the theorem is trivial. For the second part we prove 



1 

2' 

(3.2) 



E 



" 1 



By the second order of the IMEX-RK method (12. 31) and the definition of 7^ we have 



i=l i=l 

In the same way we prove that 



i=i j=i 



1 E ^3 



s ^ s 

E'^'^' = 9' E 



(5,: 



i=l 



and 



~ 1 

E = 9 



i=l 



hold, if the second order conditions for (2.3) are satisfied. However, since the re- 
maining coupling conditions cannot be simplified in the same way, we need to impose 
further the conditions (3.2). By (3.2 1 we get 



i=l 



1 yj=i j 1=1 



Accordingly, the remaining conditions hold true 



i=i 



and 



1=1 
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□ 

The order conditions apply to all variables y,p and p and therefore y and p satisfy 
in particular equation (2.2). Moreover, if uji = uji for all i = 1, .., s then the additional 
second order conditions (3.2 1 are satisfied. 

Corollary 3.1. If uJi — Cji for all i — 1, .., s and the diagonally implicit IMEX- 
RK method (2.3) is of second order, then the additive RK method (2.3) together with 
(2.8) and (3.1) is of second order. 

Proof. Since, 



and in the same way 



^ ~ 1 " 1 



and 



_ 1 

' ~ 2' 



(3.3) 



the conditions of (3.2) hold. □ 

Theorem 3.1. If uji ^ uji for alii = 1, .., s and the diagonally implicit IMEX-RK 
method 
and Wl 



2.3) is of third order, then the additive RK method (2.3) together with (2.8) 
) is of third order, provided that 

rf? 1 ^e? 1 



E 



E 



and 



E 



uj., 3 



(3.4) 



is satisfied. 

Proof. If UJi = UJi for all i = 1, .., s, it follows that aij = otij and Pij = Pij for all 
i,j = 1, .., s and moreover 7^ — 7^, 5i — Si, di — di and Ci = ii for all i = I, .., s. Now, 
by (3.4) we have 



E-^7f = E 



{uJi - diY^ 



i=l 



J2{uj,-2d, + ^) = \ 



i=l 



and similarly, 



1 '1 



i=l 



i=l 



Furthermore, the third order conditions for (2.3) imply 



S SIS \l^^ 1 

w,Ci7., = E ^^^^i E^'^J ^ ^ "j') = 2 " E E '^^ = 3 

J=l 1=1 \]=1 * / i=l ]=1 



and similarly 

s 1 ^ 

E^'C,7, = -, J2 

i=l 

Therefore, it also holds 

^ujiP.jjj = E^' ( 



uJiCiSi — - and 
o 



" 1 



s s 1 ^ 

7j = ^ - E '^j'^j' E " o " E 



(3.5) 



Wj7jCj 
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and 



and 



For the corresponding coupling conditions we get 



2J 



^ ^ S 1 ^ 



and 



and 



E 



Finally the second set of associated coupling conditions follows by the fact that 

di 



such that 



1 



and 



1 



In analogous way we obtain 



s ^ s ^ s 

ErrE^*«'^-o-E 



dj Cj 



UJn — 

j=l ■> i=l 



2 ^ uJi 



and 



3.2. Symplecticity. Under appropriate conditions the equation (2.2c) can be 
explicitly solved and thus eliminated from the optimality system (2.2c I: Assume that 
locally in the neighbourhood of a critical point u i— Huu{y,u,p) is invertible along 
the trajectory, then by the implicit function theorem, we deduce the existence of a 
function u = (p{y,p) that such that (2.2c) is satisfied [31|S1[T5]. Using the function 
(p{y,p) the associated reduced Hamiltonian system is then 



y = 'Hp{y,p) = f{y, ipiy,p)) + g{y, (p{y,p)) 

p^-'Hv{y,p) = -fy{y,f{y,p))'^p- 9v{y,fiy,p))'^p, 



(3.6a) 
(3.6b) 



where 'H{y,p) := H[y,ip{y,p),p). This system is a Hamiltionian differential equation 
[TTl E]. It has been shown [TT], that in general integration methods that preserve 
the geometric properties such as symplecticity are more suitable to solve Hamiltio- 
nian systems However, for optimal control problems, the advantage of symplectic 
integrators is not as clear, but there are cases where they provide a significant com- 
putational advantage [5]. 

Consider the discrete Hamiltionian system 



yt 



-HpXy,p)i 



Pi = Hy^{y,p) 



(3.7) 
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It is known th at t he flow ij^t generated in the phase space MJ^ x MJ^ of (y,p) by 
the equations (3.7) is symplectic, i.e. it preserves the the differential 2- form = 



'Ylif=i'^yi ^ ^Pi- Here, the differentials dyi and dpi at the stage i are computed as 
derivatives of the flow with respect to the initial data. Preserving the differential 2- 
form is equivalent 11] to ipt * ^^'^ = for the flow ipf As an important consequence 
we have hat the flow is volume and orientation preserving. Numerical methods that 
preserve symplecticity are called symplectic. There exist a variety of papers on sym- 
plectic RK methods for example [fe l [22 l |2T 1 16] . 



with (2.8 1 is symplectic for (3.6 1 



Theorem 3.2 gives conditions such that the discretization scheme (2.3) together 

We assume that u is given by u = ip{y,p) and 
^{Y^\P^,P^^). Here, (f is the corresponding 
implicitly defined function that belongs to (2.131 and we identify (p{y,p,p) = ip(y,p) 
P- 

UJi for all 



(2.14) is locally equivalent to ul- 



Assuming that (2. 14-) holds and provided that uJi 



for all (p, p) with p ; 
Theorem 3.2. 

i = l,..,s, then the discretization scheme (2.3) together with (2.8) is a symplectic 
scheme for (3.6). The proof of this theorem follows along the lines of the proof of 
Theorem 16.6 in [TT]. 



Proof. Consider the discretization scheme ( 2.3 ) together with ( 2.8 ) as a discretiza- 



tion of (3.6). Moreover, to simplify the notation replace the time indices n and n + 1 



by and 1 , respectively and let K be the dimension of i/o or po : respectively. In order 
to prove the symplecticity of our discretization scheme, we need to show that 



K K 

^dpl^Y^ dyl A dpi. 

r— 1 r— 1 



First WG simplify each summand dy[ A dp'i — dy^ A dpQ independently. Using (5.1 ) (see 
appendix), we replace dyl and dp\ and obtain 

s s 

dyl A dpi - dyl A dpi ^ -hY, ^^idyl A dQl) ~ hj^ ^^{dyl A dV^ (3.8a) 

i=l i=l 
s s 

+hJ2'^^idT[ A dpi) + hJ2^^id^ A dpi) (3.8b) 
-h'^ Y {dT[ A dg^^) -h^Y '^''^J (^^'' ^ dV(f I3.8c) 
-h^ Y w»wj (di[ A dQ'j) "h'^Yl '^^'^j{dL\ A dV[\?,.%d) 



i,3 



hi 



Next we replace dyl and dpi by terms resulting from ( 5.1 ) and insert the corresponding 
terms into (3.8), then the rhs of (3.8) simplifies to 



s s 

dyl A dpi - dyl A dpi = -hj^ i^tidY;' A dQl) ~ hj^ ^^tidY-" A dV^) (3.9a) 

i=i 1=1 

s s 

+h Y ^^{dTl A dPl) + hY ^^id^ A dP[) -h^J2 ^'^U'^'^l ^ ^^j) (3-9b) 

-e Y [dLl A dQ]) ^h^Y. ^4 idT[ AdV[)~h'Y (^^^ ^ '^^D (3-9c) 
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Here, the entries of the matrices M , M , M and M are 



Hence, using the coefhcients ctij, aij, (3ij and /3jj all entries of M , M , M"^ and M 
vanish. Using oji — Wi for all i = 1, .., s we compute 



K 



J2idyl A dpi - dyl A dpi) =h 



K 



Y,u:,J2{dY[ A {dQl + dVn) 

1=1 r=l 
s K 

+ E ^(d^r A dP:) + {dL\ A dPi) 



i—l r— 1 



By (5.2l wc then obtain for the rhs of this equation 



K 



'■r=l \ j 



+ I ^f{Y^^\u,)idY:- A dPt) + ^g\Y(^),u,)idY[ A dP/) 



K 



E [^nY(^\n,){dYfAdP[) + ^g^{Y(^\u,) ] {dY/' A dP[) 



Since dY^"- A dYf = -dYf A dY[ and dY[ A dY[ = holds for the exterior product 
(of. |llp. the previous term vanishes and thus 



K 



Y,[Wi^dp\)~{dylAdpl)]=Q. 



□ 

4. Numerical example and implementation. We consider the following prob- 
lem taken from Hager [TU] 



1 

min - / [v^ + 2x'^)dt subject to 
2 Jo 

x{t) = -x{t) + u{t), x{0) ^ 1. 



(4.1) 
(4.2) 



The optimal solution is denoted by {u* , x* ) where 

2(exp(3t) - exp(3)) 



u*{t) 



exp(3i/2)(2 + exp(3))' 



(4.3) 



14 



To illustrate the numerical methods we reformulate the problem as a singularly per- 
turbed differential equation 



min c(l) subject to 

m^\{u^(t)+^\t)+4z^t)), c(0) 
x{t) = z{t) + u{t), x{0) 1 



1 



-^{^x{t) - z{t)), 2(0) = - 



for some e > 0. Note that, 



as e 



0, in_JX4}l) we get z{t) 



substituting into (4.4:) we recover system (4.2). Taking 

'1 



y 



-4z" 



z + u 




9{y) 



(4.4a) 
(4.4b) 
(4.4c) 
(4.4d) 
x{t)/2 and thus 






we obtain a system of the form (2.1 ) for a scalar valued control u{t). We discretize the 
system IMEX methods. We report on results for the second-order L-stable IMEX 



SSP2 scheme (5.1), the second-order globally stiffly accurate IMEX-GSA (5.2) and 



the third-order IMEX-SA3 ( |5.4[ ) scheme 
on [0, 1] with n 



. In all cases we consider an equidistant grid 
In the following we set u = (M„)iy_i where 



1,...,7V gridpointS. unc njuuwing wc ocu u, — \u,njn=l 

u(tn) and similarly for x and c. In order to numerically solve the optimality 

We do not 



conditions for (4.4) it remains to solve equations (2.31,(2.8) and (2.14) 



present the detailed formulas but have some remarks concerning the implementation. 



In view of the stiff source in g it is advantegous to solve instead of (2.3) the 



equivalent system (2.4). As mentioned below Lemma 2.2 the adjoint equation (2.81 



has to be solved backwards in time. It is therefore advantegous to use the equivalent 
formulation (2.7). Note that the discretized terminal condition for the adjoint scheme 
is pm = (1, 0, 0). Furthermore, we assume for simplicity that the control u is the same 



on each stage. Then, equation ( |2.14 ) reads 



(4.5) 



k=l 



and 



where <^ and £, are the solution to (2.7). The optimality system (2. 3), (2. 8), (2.14) is 



solved by a block Gauss-Seidel method: Provided we know u on all grid points we 



solve (2.4) to obtain the state y. Having u and y at hand we can solve (2.7 1 in o rder 



to obtain p and subsequently ^ and ^. However, for an arbitrary u equation (4.5 ) will 
not hold true. We therefore use a nonlinear root finding method F{u) = 0. We set 

s 

F{u) = ||(F„(z.)),lJ^ F„(u) ^Y.f-^^n\nn)~e^+9u{Y,\'\u^)e\ (4-6) 

fe=i 



where F„ ,C« and ^„ are all dependent on u through (2.4 1 and (2.7), respectively. 



Since in the example gy is independent of y we can solve the adjoint equations (2.71 
more efficiently. We use the fact that 



Y:^yiU, + aAt)+Oi{Atr), 
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where p is the order of the exphcit part of the IMEX scheme . Hence, instead of 
computing at every time step in the adjoint equation the values ioi i = 1, . . . , s 
we interpolate the given data {yn)n using a second-and third-order accurate interpo- 
lation, respectively. 

We study the dependence of F{u*) on grid size and value of e for the control 



u* given by (4.3 1. This control is optimal only in the case of e = 0. In order to 
observe the convergence rates we compute a fine grid solution on N — 640 grid points. 
The corresponding states and adjoints are denoted by {c*,x*,z*) and {pl,P2,pl), 
respectively. Note that due to the particular structure of the problem we always 
have pI (t) — 1 and therefore we do not report this quantity in the tables below. We 
denote by F* = {Fn{u*)) and we not necessarily have ||-F*||oo — since the control 
u* is not optimal for the problem (4.4 1 in case e > 0. As in |10j we also report the 



error in the state x(t), and relaxation variable z{t) obtained on a finer grid. 
Convergence results for the IMEX-SA3 and IMEX-GSA are given in Table [53] and 



Table 5.6 respectively. We observe that the convergence properties in the state remain 
independent on e. Since the IMEX-SA3 scheme is not stiffly accurate we loose third- 
order convergence in the relaxation variable z when e is underresolved. Contrary, we 
observe second-order convergence also for small e for the stiffly accurate IMEX-GSA 
scheme. 

Further, we study the convergence behavior of the nonlinear root finding method. 
Initially, we set u{t) = 1 and subsequently solve F{u) = using a standard black- 
box root finding method of Matlab with termination tolerance l.e — 08 on F. Even so 
F{{un)n) is zero up to machine precision in the optimization procedure there remains a 
difference in the computed trajectory (cc„)„ compared with x* . The results for IMEX- 
SSP2 are given in Table 5.7 We observe that the L^— difference between analytical 



and numerically computed trajectory and control decreases with grid size. 

5. Summary and conclusions. We investigated the application of IMEX Runge- 
Kutta methods to optimal control problems. In particulare we focused on order 
conditions and conditions for symplecticity. We studied the adjoint equations and 
established a commutative diagram for optimization and discretization. In particu- 
lar cases previous results for single Runge-Kutta schemes could be recovered [51 [TU]. 
Examples of up to third order IMEX Runge-Kutta methods are given and numerical 
results for a sample problem have been presented. 

Appendix. 



Al. Addenda on the proof of Lemma 2.1 , The Lagrangian function of 



(2.10) is given by 



£(2/,y,p,C)=j(yiv)+P°-(2/o-y") 



E 

n=0 



n ' I 



To simplify the notation in the proof of Theorem |3.2[ the time indices n and 
are replaced by and 1, respectively, for the discrete approximations and p„ and 
they are ommitted for the intermediate states, which are denoted by vectors Yi E M^^, 
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Pi G M or Pi e M , respectively, with i being the index for the stages 1, .., s. In this 
notation the schemes (2.3) and (|2.8p read for i = 1, .., s 



yi^yo + h^U]if{Yi,Ui) + h^ujig{Y„u,) 

1=1 i=l 

and for i — 1, . . . , s 

s s 

Pi =Po -h^a^j fyiYj,UjfPj ~ h^aij gyiY.i,UjfPj 

s s 

p^=Po- hY^hhiy^^^ofp, - hY.NgvO^o^^^fp, 



Pi =Po 



h'^Cbi fy{Yi,Ui)'^ Pi - '^uji gy{Yi,UiY Pi. 



The one-forms dy{ : M^^ ^- M and di;'^ : E^a: ^ ]g ^gg^j ^-j^g ^j-^^f of Theorem [3^ 
are defined by 

zi-^--^ ^^ and 2 '-^ ^7 ^z, 

respectively and similarly also dp^, dP[ and dP[, where y[, p^, Y^ , P[ and P[, 
denote the rth component of the corresponding vector. By differentiation of the 
above equations with respect to (yo,Po) and using the linearity of the differential we 
obtain 

S3 S S 

dY[ ^dyl + hY^ &,jdTJ + hY, a^^dL], dy{ = dy^ + Co,dTl + u,dL\ 

j=l j = l 1=1 i=l 

(5.1a) 

s s s s 

dP[ = dp[,-hY, a^jdQ] -hJ2 a^JdV[, dP[ ^ dp^^ ~ hj^ h^Q] - (3^jdV[ 

(5.1b) 

S S 

dpi ^dpl + h'^ CjdQI + h'^ uj^dVl 

i=l i=l 

(5.1c) 

for i — I, .., s and r — I, .., K , where 

^ a ^ a 

= E g-eriY^,u,)dY/, dL\ = —^g^{Y„u,)dYf, (5.2a) 
i=i y 1=1 y 

^Q^ = EEa5^7/^(^''"')^' + (5.2b) 
j=i 1=1 ^ ^ j=i ^ 

'^^^^ = EEa5^5^(^-"')^' dYl + j2^^9^{Y..uMP!. (5.2c) 

7 = 1 1 = 1 ^ ^ 7 = 1 ^ 
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A2. Examples of IMEX schemes. We use the following convention for the 
names of the schemes: Name(fc, a^, cri) where k is the order, aE the number of levels 
in the explicit scheme and ai the number of levels in the implicit scheme. 

A two stage second order IMEX method, where the implicit part is L-stable, is 
given by Pareschi and Russo in 19J. Since Cj — uj the second order conditions for the 












7 


7 





(i,^): 1 


1 





{A,uj): 1-7 


1-27 


7 




1/2 


1/2 




1/2 


1/2 



Table 5.1 

IMEX-SSP2{2, 2, 2) is a second-order IMEX scheme. The factor 7 is given by -) = I - l/\/2. 



additive RK scheme are directly satisfied. To avoid loss of accuracy in stiff problems. 



in order to compute a globally stiffly accurate method {hs 



and 



] 



1, . . . , s see [S]), we are forced to take Cj ^ uj, 



UJv 



and impose the additional 
second order conditions (3.2). In this case at least 4 levels are required. An example 



is IMEX-GSA(2,3,4) reported below 


















1/2 


1/2 











3/2 


3/2 











5/4 


3/4 


1/2 








1/2 


5/6 


-1/3 








{A,uj): 1/4 


-1/4 





1/2 





1 


1/3 


1/6 


1/2 





1 


1/6 


-1/6 


1/2 


1/2 




1/3 


1/6 


1/2 







1/6 


-1/6 


1/2 


1/2 



Table 5.2 

/MEJf-GS'j4(2, 3, 4) is a second order globally stiffly accurate scheme. 



Moreover, it can be shown that for a) = a; no three stages IMEX-RK that satisfies 



the additional third-order conditions in Theorem 3.1 with A-stable implicit integrator 
exist. Here, we report three stage third order IMEX-RK method such that the explicit 
scheme corresponds to the third-order scheme given in [TO] . 












1/2 


1/2 





1 


-1 2 







1/6 2/3 


1/6 



[Auj): 















1/2 


1/4 


1/4 





1 





1 







1/6 


2/3 


1/6 



Table 5.3 

IMEX-HAG{3,3,3) is a third-order IMEX scheme, where {A,lu) corresponds to the third-order 
scheme given by Hager in ilOf . 



Finally we present a third order scheme which uses 4 levels in order to achieve 
better stability properties in the implicit integrator. 

Note that all IMEX schemes of type ARS have lui = 0, such that they do not 
satisfy the condition cD 7^ and cj 7^ 0. However if they are of the particular structure 
of [I] with LUj for j ^ 1 and w 7^ 0, then we can still find a variable transformation 
such that the conclusion of Proposition 2.1 holds (see Remark in Section [2]). 
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2/3 


2/3 











2/3 


-1/3 


1 








1 


3/4 


1/4 








{A,uj): 1 


-1/4 


1/4 


1 





1 


1/4 


3/4 








1 


1/4 


3/4 


-1/2 


1/2 




1/4 


3/4 


-1/2 


1/2 




1/4 


3/4 


-1/2 


1/2 



Table 5.4 

IMEX-SA{3, 4, 4) is a four stages, third order IMEX scheme. 
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N 


e 


\\Fn{Un)n\\'i 


lice* - (a;„^ 


n 1 OO 


\\u* - (Un 


n II OO 


10 


Oe+00 


9.3435e-16 


1.1661e-01 


(0.00) 


8.1212e-02 


(0.00) 


20 


Oe+00 


5.7291e-15 


5.4444e-02 


(2.14) 


3.5548e-02 


(2.28) 


40 


Oe+00 


2.1384e-14 


2.5674e-02 


(2.12) 


1.6495e-02 


(2.16) 


80 


Oe+00 


4.8134e-13 


1.1822e-02 


(2.17) 


7.9952e-03 


(2.06) 


160 


Oe+00 


5.2921e-13 


5.0243e-03 


(2.35) 


3.7065e-03 


(2.16) 


320 


Oe+00 


4.9687e-ll 


1.6345e-03 


(3.07) 


1.4217e-03 


(2.61) 


10 


le-02 


9.2474e-16 


1.1649e-01 


(0.00) 


8.0766e-02 


(0.00) 


20 


le-02 


4.6022e-15 


5.4428e-02 


(2.14) 


3.5485e-02 


(2.28) 


40 


le-02 


1.6138e-13 


2.5673e-02 


(2.12) 


1.64286-02 


(2.16) 


80 


le-02 


4.0505e-13 


1.1818e-02 


(2.17) 


7.9385e-03 


(2.07) 


160 


le-02 


4.9368e-13 


5.0148e-03 


(2.36) 


3.66116-03 


(2.17) 


320 


le-02 


2.1669e-ll 


1.6241e-03 


(3.09) 


1.43366-03 


(2.55) 


10 


le-01 


8.7718e-15 


1.1655e-01 


(0.00) 


7.96576-02 


(0.00) 


20 


le-01 


6.4816e-15 


5.4631e-02 


(2.13) 


3.55086-02 


(2.24) 


40 


le-01 


2.9944e-14 


2.5813e-02 


(2.12) 


1.62186-02 


(2.19) 


80 


le-01 


5.3096e-14 


1.1900e-02 


(2.17) 


7.69486-03 


(2.11) 


160 


le-01 


1.0236e-13 


5.0614e-03 


(2.35) 


3.43656-03 


(2.24) 


320 


le-01 


4.4261e-ll 


1.6702e-03 


(3.03) 


1.14346-03 


(3.01) 


10 


lc+00 


3.6848e-15 


1.0539e-01 


(0.00) 


6.79256-02 


(0.00) 


20 


lc+00 


9.3571e-16 


4.9678e-02 


(2.12) 


3.09246-02 


(2.20) 


40 


le-l-00 


3.8451e-16 


2.3511e-02 


(2.11) 


1.4248e-02 


(2.17) 


80 


le-t-00 


2.5252e-13 


1.0821e-02 


(2.17) 


6.39996-03 


(2.23) 


160 


le-hOO 


3.9397e-12 


4.5746e-03 


(2.37) 


2.58576-03 


(2.48) 


320 


le-hOO 


1.9792e-12 


1.4744e-03 


(3.10) 


7.37456-04 


(3.51) 



Table 5.7 



u* is the numerically obtained control 07^ on a grid with N = 640 mesh points after com- 
putation of F(u*) = 0. with initial value u{t) = 1 for the Gauss-Seidel iteration. is the 
numerically obtained control after succesfull computation of F{u) = with the same initial value 
for the Gauss-Seidel iteration. The corrresponding states are x* and {x„)„, respectively, and they 
are the discrete solution to Note that in order to compute -F((u„)n) the adjoint states have 
obtained through Jg.gp using IMEX SSP2. The value in the brackets is the residual of the current 
divided by the residual of the previous result. 



